Using Global Constraints and Reranking to Improve Cognates Detection
Abstract
Global constraints and reranking have not previously been used in cognates detection research. We propose methods that apply global constraints by rescoring the score matrices produced by state-of-the-art cognates detection systems. Rescoring with global constraints is complementary to state-of-the-art cognates detection methods and yields significant performance improvements over the current state of the art on publicly available datasets. The improvements hold across different language pairs and under various conditions, including different levels of baseline performance and different data sizes, among them larger and more realistic data sizes than have been evaluated in the past.
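To make the idea of rescoring a score matrix concrete, here is a minimal sketch in Python. It is not the method proposed in the paper, whose details the abstract does not give. It assumes one simple global constraint, namely that each word has at most one true cognate in the other language, and it penalizes every candidate pair by the strongest competing score in the same row or column. The function name `rescore_with_competition` and the toy scores are hypothetical.

```python
def rescore_with_competition(scores):
    """Rescore a cognate score matrix under an assumed one-to-one constraint.

    scores[i][j] is a baseline system's score for source word i and
    target word j. Each cell is reduced by the best rival score found
    elsewhere in its row or column (an illustrative choice, not the
    paper's formula).
    """
    n_rows = len(scores)
    n_cols = len(scores[0]) if n_rows else 0
    rescored = []
    for i in range(n_rows):
        new_row = []
        for j in range(n_cols):
            # Rivals: other target candidates for word i, and other
            # source candidates for target word j.
            row_rivals = [scores[i][k] for k in range(n_cols) if k != j]
            col_rivals = [scores[k][j] for k in range(n_rows) if k != i]
            rivals = row_rivals + col_rivals
            penalty = max(rivals) if rivals else 0.0
            new_row.append(scores[i][j] - penalty)
        rescored.append(new_row)
    return rescored


# Toy baseline scores (made up for illustration).
scores = [
    [0.9, 0.6, 0.1],
    [0.2, 0.5, 0.3],
    [0.1, 0.2, 0.6],
]
rescored = rescore_with_competition(scores)
```

In this toy matrix the pair (0, 1) has a higher raw score than (1, 1), but source word 0 has a much stronger candidate in column 0. After rescoring, (1, 1) ranks above (0, 1), which is the kind of effect a global constraint is meant to produce on top of a baseline scorer.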
Similar resources
Bilingual lexicon extraction from comparable corpora for closely related languages
In this paper we present a knowledge-light approach to extract a bilingual lexicon for closely related languages from comparable corpora. While in most related work an existing dictionary is used to translate context vectors, we take advantage of the similarities between languages instead and build a seed lexicon from words that are identical in both languages and then further extend it with co...
Discriminative Reranking for Semantic Parsing
Semantic parsing is the task of mapping natural language sentences to complete formal meaning representations. The performance of semantic parsing can be potentially improved by using discriminative reranking, which explores arbitrary global features. In this paper, we investigate discriminative reranking upon a baseline semantic parser, SCISSOR, where the composition of meaning representations...
SINAI at NTCIR-9 GeoTime: a filtering and reranking approach based solely on geographical entities
Geographic Information Retrieval (GIR) is an active and growing research area that focuses on the retrieval of textual documents according to geographical criteria of relevance. In recent years, the IR research community has paid particular attention to IR systems that take into account temporal constraints. Temporal Information Retrieval (TIR) is a recent field which addresses the combinatio...
JRS at Search and Hyperlinking of Television Content Task
This paper describes the work done by the JRS team for the linking sub-task. We submitted eight pairs of runs: four with different textual resources only, two using reranking based on visual similarity, and two using concept detection results. Each of the pairs consists of one run using the anchor segment only, and one using a longer context segment. The results show higher variance between anc...
Automatic Detection of Cognates Using Orthographic Alignment
Words undergo various changes when entering new languages. Based on the assumption that these linguistic changes follow certain rules, we propose a method for automatically detecting pairs of cognates employing an orthographic alignment method which proved relevant for sequence alignment in computational biology. We use aligned subsequences as features for machine learning algorithms in order t...
Publication date: 2017